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Amendments to the Drawings: 

The attached sheets of drawings include changes to Figures 1-5 and new Figure 6. 
These sheets, which include Figs. 1-6, replace the original sheets including Figs. 1-5. 

Attachment: Replacement Sheets and New Sheet 
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REMARKS 

Claims 1-7, 1 1-13, 15-18, 27-29 and 31-33 are pending in the application. Claims 
8-10, 14, 19-26, 30 and 34 are canceled. Claims 1, 5, 6, 16-18 and 27 are currently amended. 
No new matter is introduced. 

Figures 1-5 have been amended to correct informalities and 5 sheets of 
Replacement Drawings and 1 new sheet including new Figure 6 are presented herewith for 
approval. 

This Amendment is supported by the concurrently filed Declaration of Dr. 
Kaushik Saha and Mr. Srijib Narayan Maiti (hereinafter "Inventor Decl."). Dr. Saha and Mr. 
Maiti are experts in the field of signal processing. Inventor Decl., ^ 1-2, Appendixes 1 and 2. 
Dr. Saha and Mr. Maiti have reviewed the present application, the Office Action mailed on 
February 1, 2010, and the references cited therein. Inventor Decl., ^ 3. 
The Drawings Are Sufficient 

The Examiner objected to the drawings under 35 CFR Section 1 .83(a) for failing 
to show all the limitations of the claims. Specifically, the Examiner contends the figures do not 
show the "structure of computing/performing N-point FFT/IFFT of the signal using first and 
second stages wherein the second stage employs single, un-nested computational loop," of the 
independent claims. The Examiner's objections are respectfully traversed. 

The independent claims do not use the exact language about which the Examiner 
complains. Each independent claim will be addressed in turn. 

Independent claim 1, as amended, recites, "[a] method of processing a digital 
signal by computing a Fast Fourier Transform (FFT) or Inverse Fast Fourier transform (IFFT) of 
the digital signal, the method comprising using a multiprocessor computing system having a 
plurality of processors P configured to perform the steps of: computing an N-point FFT/IFFT of 
the signal using first and second sets of butterfly computational stages, each stage in the second 
set of stages employing a plurality of butterfly operations, wherein each of the butterfly 
operations in each stage in the second set of stages has a single, un-nested computation loop; and 
distributing the plurality of butterfly operations in each stage of the second set of stages such that 
each processor computes an equal number of complete butterfly operations thereby eliminating 
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data interdependency among the parallel processors." It is unclear what the Examiner means by 
the contention that the figures do not show the structure of computing. To the extent the 
Examiner is referring to the multiprocessor system, an embodiment of such a system is shown in 
Figure 4, where a multiprocessing system comprises clusters 12 (i.e., processors, see 
Specification at 7, line 24 to 8, line 25). Figure 5 contains more detail on an embodiment of a 
cluster of Figure 4. To the extent the Examiner is referring to the recited method steps, Figures 2 
and 3 illustrate implementations of embodiments of the method. Figure 2 shows a two-processor 
implementation and Figure 3 shows a four processor implementation, where different line styles 
represent computation in each of the processors. See Page 6, line 25 to 7, line 22. See also 
Inventor Decl., f 5. To the extent the Examiner's contention is that no figure shows "a single, 
un-nested computation loop," Applicants disagree that any additional figure is needed, however 
to expedite prosecution new Figure 6 has been added. No new matter is introduced. See original 
claim 1. 

Independent claim 3 recites, "[a] multiprocessor system to process a digital signal 
by computing a Fast Fourier Transform (FFT) or Inverse Fast Fourier transform (IFFT) of the 
signal using a decimation in time or decimation in frequency approach, comprising: the 
multiprocessor system having a plurality of processors P and configured to implement: means for 
computing a first plurality of log2P stages of an N-point FFT/IFFT of the signal; means for 
computing a second plurality of stages of the N-point FFT/IFFT of the signal using in each stage 
of the second plurality of stages a plurality of butterfly operations, wherein each butterfly 
operation employs a single butterfly computation loop without employing nested loops; and 
means for distributing the butterfly operations in each stage of the second plurality of stages such 
that each processor computes an equal number of complete butterfly operations thereby 
eliminating data interdependency in the stages of the second plurality of stages." It is unclear 
what the Examiner means by the contention that the figures do not show the structure of 
computing. To the extent the Examiner is referring to the multiprocessor system, an embodiment 
of such a system is shown in Figure 4, where a multiprocessing system comprises clusters 12 
(i.e., processors, see Specification at 7, line 24 to 8, line 25). Figure 5 contains more detail on an 
embodiment of a cluster of Figure 4. To the extent the Examiner is referring to the recited 
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functions, Figures 2 and 3 illustrate embodiments of the implementation of the functions. Figure 
2 shows a two-processor implementation and Figure 3 shows a four processor implementation, 
where different line styles represent computation in each of the processors. See Page 6, line 25 
to 7, line 22. See also Inventor Decl., f 6. To the extent the Examiner's contention is that no 
figure shows "a single butterfly computational loop without employing nested loops," Applicants 
disagree that any additional figure is needed, however to expedite prosecution, new Figure 6 has 
been added. No new matter is introduced. See original claim 1 . 

Independent claim 5, as amended, recites, "[a] non-transitory computer-readable 
storage medium whose contents cause a system having a plurality of processors to perform a 
linear scalable method of transforming a signal by computing with the plurality of processors a 
Fast Fourier Transform (FFT) or an Inverse Fast Fourier Transform (IFFT) of the signal, the 
method comprising: computing a first plurality of stages of an N-point FFT/IFFT; and computing 
a second plurality of stages of the N-point FFT without employing nested loops and by 
distributing the butterfly operations in each stage of the second plurality of stages such that each 
processor computes an equal number of complete butterfly operations thereby eliminating data 
interdependency in the stage." It is unclear what the Examiner means by the contention that the 
figures do not show the structure of computing. To the extent the Examiner is referring to a 
system having a plurality of processors, an embodiment of such a system is shown in Figure 4, 
where a multiprocessing system comprises clusters 12 (i.e., processors, see Specification at 7, 
line 24 to 8, line 25). Figure 5 contains more detail on an embodiment of a cluster of Figure 4. 
To the extent the Examiner is referring to the recited method, Figures 2 and 3 illustrate the 
implementation of embodiments of the method. Figure 2 shows a two-processor implementation 
and Figure 3 shows a four processor implementation, where different line styles represent 
computation in each of the processors. See Page 6, line 25 to 7, line 22. See also Inventor Decl., 
| 7. To the extent the Examiner's contention is that no figure shows "computing a second 
plurality of stages of the N-point FFT without employing nested loops," Applicants disagree that 
any additional figure is needed, however to expedite prosecution, new Figure 6 has been added. 
No new matter is introduced. See original claim 1 . 
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Independent claim 16, as amended, recites, "[a] non-transitory computer-readable 
storage medium whose contents cause a system having a plurality of processors to perform a 
linear scalable method of transforming a signal, the method comprising: computing an N-point 
FFT/IFFT using a first plurality of butterfly computational stages and a second plurality of 
butterfly computational stages, each stage in the second plurality of stages employing a plurality 
of butterfly operations having a single, un-nested computation loop; and distributing the plurality 
of butterfly operations in each stage of the second plurality of stages such that each processor 
computes an equal number of complete butterfly operations thereby eliminating data 
interdependency in the stage." It is unclear what the Examiner means by the contention that the 
figures do not show the structure of computing. To the extent the Examiner is referring to a 
system having a plurality of processors, an embodiment of such a system is shown in Figure 4, 
where a multiprocessing system comprises clusters 12 (i.e., processors, see Specification at 7, 
line 24 to 8, line 25). Figure 5 contains more detail on an embodiment of a cluster of Figure 4. 
To the extent the Examiner is referring to the recited method, Figures 2 and 3 illustrate the 
implementation of embodiments of the method. Figure 2 shows a two-processor implementation 
and Figure 3 shows a four processor implementation, where different line styles represent 
computation in each of the processors. See Page 6, line 25 to 7, line 22. See also Inventor Decl., 
If 8. To the extent the Examiner's contention is that no figure shows "each stage in the second 
plurality of stages employing a plurality of butterfly operations having a single, un-nested 
computation loop," Applicants disagree that any additional figure is needed, however to expedite 
prosecution, new Figure 6 has been added. No new matter is introduced. See original claim 1 . 

Independent claim 27, as amended, recites, "[a] method of transforming a digital 
signal, the method comprising: using a multiprocessor computing system having a plurality P of 
processors configured to: compute a first number of butterfly stages of an N-point Fast Fourier 
Transform (FFT) or Inverse Fast Fourier transform (IFFT); and compute remaining butterfly 
stages of the N-point FFT/IFFT with a single iterative loop wherein each processor computes an 
equal number of butterfly operations and there is no data dependency between butterflies in a 
stage of an iteration of the loop." It is unclear what the Examiner means by the contention that 
the figures do not show the structure of computing. To the extent the Examiner is referring to a 
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multiprocessor computing system having a plurality of processors, an embodiment of such a 
system is shown in Figure 4, where a multiprocessing system comprises clusters 12 (i.e., 
processors, see Specification at 7, line 24 to 8, line 25). Figure 5 contains more detail on an 
embodiment of a cluster of Figure 4. To the extent the Examiner is referring to the recited 
method, Figures 2 and 3 illustrate the implementation of embodiments of the method. Figure 2 
shows a two-processor implementation and Figure 3 shows a four processor implementation, 
where different line styles represent computation in each of the processors. See Page 6, line 25 
to 7, line 22. See also Inventor Decl., U 9. To the extent the Examiner's contention is that no 
figure shows "compute remaining butterfly stages of the N-point FFT/IFFT with a single 
iterative loop," Applicants disagree that any additional figure is needed, however to expedite 
prosecution, new Figure 6 has been added. No new matter is introduced. See original claim 1 . 

Independent claim 31 recites, "[a] system, comprising: an instruction fetch cache; 
and a plurality of processors P coupled to the instruction fetch catch and configured to: compute 
a first number of butterfly stages of an N-point Fast Fourier Transform (FFT) or Inverse Fast 
Fourier Transform (IFFT) of a digital signal; and compute remaining butterfly stages of the N- 
point FFT/IFFT with a single iterative loop wherein there is no data dependency between 
butterflies in a stage of an iteration of the loop." It is unclear what the Examiner means by the 
contention that the figures do not show the structure of computing. To the extent the Examiner is 
referring to a system comprising an instruction fetch cache and a plurality of processors, an 
embodiment of such a system is shown in Figure 4, where a multiprocessing system comprises 
clusters 12 (i.e., processors, see Specification at 7, line 24 to 8, line 25). Figure 5 contains more 
detail on an embodiment of a cluster of Figure 4. To the extent the Examiner is referring to the 
configuration of the plurality of processors, Figures 2-5 illustrate the implementation of 
embodiments. Figure 2 shows a two-processor implementation, Figure 3 shows a four processor 
implementation, where different line styles represent computation in each of the processors. 
Figure 4 shows an N processor implementation. See Page 6, line 25 to 7, line 22. See also 
Inventor Decl., TJ 10. To the extent the Examiner's contention is that no figure shows "compute 
remaining butterfly stages of the N-point FFT/IFFT with a single iterative loop," Applicants 
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disagree that any additional figure is needed, however to expedite prosecution, new Figure 6 has 
been added. No new matter is introduced. See original claim 1 . 
The Claims Are Enabled by the Specification 

The Examiner rejected claims 1-7, 1 1-13, 15-18, 27-29 and 31-33 under 35 USC 
Section 1 12, first paragraph, as not enabled. The Examiner's rejections are respectfully 
traversed. The Examiner presents claim 1 as an example and contends that "first and second sets 
of butterfly computational stages ... wherein the second stage employs a single un-nested 
computational loop," was never fully addressed in the summary or the detail in such a way that 
one of skill in the art would be able to make or use the invention. 

There is a strong presumption that the specification contains an adequate 
disclosure as filed and the Examiner has the burden of presenting evidence or reasons why one of 
skill in the art would not recognize that the written description provides support for the claims. 
MPEP 2163(11). Further, there is no in haec verba requirement. MPEP 2163. For computer 
related claims, a disclosure of a processor capable of performing a function and the function is 
generally sufficient to enable the claims. See, e.g., Fonar Corp. v. General Electric Corp., 107 
F.3d 1543, 1549 (Fed. Cir. 1997) ("As a general rule, where software constitutes part of a best 
mode of carrying out an invention, description of such a best mode is satisfied by a disclosure of 
the function of the software. This is because, normally, writing code for such software is within 
the skill of the art, not requiring undue experimentation, once its functions have been disclosed. 
. . . Thus, flow charts or source code listings are not a requirement for adequately disclosing the 
functions of software."); In re Hayes Microcomputer Prods. Lit, 982 F.2d 1527, 1534-35 (Fed. 
Cir. 1992) (One skilled in the art would know how to program a microprocessor to perform the 
necessary steps described in the specification. Thus, an inventor is not required to describe every 
detail of his invention. An applicant's disclosure obligation varies according to the art to which 
the invention pertains. Disclosing a microprocessor capable of performing certain functions is 
sufficient to satisfy the requirement of section 1 12, first paragraph, when one of skill in the 
relevant art would understand what is intended and know how to carry it out."). 

Here, the Examiner's complaint appears to be that the specification is concise, and 
possibly terse. However, that is not the test for enablement. As an initial matter, one of skill in 
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the art would have understood the difference between nested and un-nested loops in general. 
Inventor Decl., ^ 12. Thus, one of skill in the art would understand what was meant by an un- 
nested loop. Figures 2 and 3 show multiple stages of an N-point FFT/IFFT implemented on 
multiprocessor systems, and the description thereof (see pages 6-8) as well as original claim 2 
describe how to implement at least one embodiment of distributing the remaining stages of the 
butterfly operations among the processors in an un-nested loop (for example, "by assigning to 
each processor of the multi-processor system respective addresses of memory locations 
corresponding to inputs and outputs required for each specific butterfly operation assigned to the 
processor"). See Inventor Decl., f 12. This is sufficient to satisfy the enablement requirement. 
Thus, claim 1 is enabled. The Examiner has the burden of showing lack of enablement in the 
context of the language of the claims, and the Examiner has not explained why claim 1 or the 
other claims are not enabled by the specification. Nevertheless, the remaining claims are enabled 
for reasons similar to those set forth above with respect to claim 1 . Accordingly, all the pending 
claims are enabled by the specification. Applicants note that original claims 1 and 2 have been 
copied into the specification. No new matter has been introduced. See original claims 1 and 2. 

The Claims Are Sufficiently Definite 

The Examiner rejected claims 5, 6 and 16-18 under 35 USC Section 1 12, second 
paragraph, as indefinite. Specifically, the Examiner contends "computer-readable storage 
medium" is unclear because the specification does not fully address the definition or type of the 
computer-readable storage medium. The Examiner's rejections are respectfully traversed. 

The Examiner suggests one of skill in the art would not know whether the 
medium was tangible or non-tangible. Just prior to the Examiner's rejection, the Under 
Secretary of Commerce issued a Notice on January 26, 2010. While the Notice addresses 
rejections of computer readable medium claims under Section 101, the Notice is nevertheless 
instructive, as it addresses both the ordinary and customary meaning of "computer readable 
medium" and indicates the use of "non-transitory" would not ordinarily raise new matter 
concerns. Claims 5, 6 and 16-18 have been amended as suggested by the Notice to specify that 
the computer readable storage medium are "non-transitory." The Notice indicates that one of 
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skill in the art would know the ordinary and customary meaning of a non-transitory computer 
readable medium, and this is not a case where the specification was limited to a signal per se. 
See page 5, lines 17-21 (disclosing a computer readable storage medium). Accordingly, claims 
5, 6 and 16-18 as amended are sufficiently definite, as one of skill in the art would understand 
transitory medium was excluded, and no new matter has been introduced by the amendment 
where the original specification, as recognized by the Examiner, covered non-transitory medium 
such as RAM, ROM, hard drives and CD-ROMS. It is noted that embodiments may employ 
non-transitory medium in combination with transitory medium. 

The Claims Are Directed to Statury Subject Matter 

The Examiner rejected claims 1, 2, 7, 1 1 and 27-29 under 35 USC Section 101, as 
directed to non- statutory subject matter. The Examiner admits the claims are tied to a 
multiprocessor system, but contends insufficient detail is provided about the multiprocessor 
system for it to be considered a specific machine or apparatus. The Examiner's rejections are 
respectfully traversed. While it is believed that independent claims 1 and 27 were directed to 
allowable subject matter prior to amendment, to expedite prosecution claims 1 and 27 have been 
amended to include the word "configured," which the Examiner appears to have found sufficient 
to obviate concerns as to whether claim 3 was directed to statutory subject matter. 

As previously argued, it is noted that during prosecution much of the Examiner's 
concerns centered on whether the claims were directed to statutory subject matter. The state of 
the law regarding whether computer-related claims are directed to statutory subject matter is an 
issue that has been in flux. Recently, new Interim Patent Subject Matter Eligibility Examination 
Instructions (the "Interim Instructions") were issued by the Patent Office, which set forth 
example formats for claiming that Applicants adopted and which Applicants believe obviate any 
concerns that the claims are directed to non-statutory subject matter. 

The Interim Instructions contrast five hypothetical computer-related claims, 
claims 2-6. Hypothetical claims 4 and 6 are not statutory, but hypothetical claims 2, 3 and 5 are 
statutory. The difference between non-statutory hypothetical claims on the one hand and the 
statutory hypothetical claims on the other hand is that the statutory claims recite a machine in 
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connection with a part of the claimed embodiment of the hypothetical claim that is more than 
mere insignificant extra-solution activity. For example, under the interim instructions 
hypothetical claim 4 is not statutory, but hypothetical claim 5 is statutory because under the 
broadest reasonable interpretation of hypothetical claim 5, the step performed using a 
microprocessor requires a particular programmed processor and the step is not mere insignificant 
extra-solution activity. Hypothetical claim 5 recites, "comparing, using a microprocessor." It is 
noted that hypothetical claim 5 does not provide any detail about the microprocessor other than 
to specify that a step is performed using the microprocessor. Nevertheless, the interim 
instructions indicate it is directed to statutory subject matter. 

Turning to the language of the claims at issue here, independent claim 1 , as 
amended, recites, "the method comprising using a multiprocessor computing system having a 
plurality of processors P configured to perform the steps of: [all of the claimed steps]." Claim 1 
is directed to statutory subject matter for the same reasons that hypothetical claim 5 of the 
Interim Instructions was directed to statutory subject matter. Under the broadest reasonable 
interpretation, the multiprocessor computing system must be configured or programmed in a 
particular manner to perform all of the recited method steps. Thus, a particular machine is 
recited, the multiprocessor computing system. In addition the machine imposes a meaningful 
limitation and is more that an insignificant extra-solution activity because all of the method steps 
are performed by the claimed multiprocessor computing system, and thus at least one of the steps 
must be more than mere insignificant extra-solution activity. Further, claim 1 provides more 
detail than hypothetical claim 5, because it specifies "a multiprocessor computing system having 
a plurality of processors P." Thus claim 1 is directed to statutory subject matter. Claims 2, 7 and 
1 1 depend from claim 1 and are directed to statutory subject matter at least by virtue of their 
dependencies. 

Independent claim 27 recites "using a multiprocessor computing system having a 
plurality P of processors to: [all of the recited method steps.]" Under the broadest reasonable 
interpretation, the multiprocessor computing system must be configured or programmed in a 
particular manner to perform all of the recited method steps. Thus, a particular machine is 
recited, the multiprocessor computing system. In addition the machine imposes a meaningful 
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limitation and is more that an insignificant extra-solution activity because all of the method steps 
are performed by the claimed multiprocessor computing system, and thus at least one of the steps 
must be more than mere insignificant extra-solution activity. Thus claim 27 is directed to 
statutory subject matter. Claims 28 and 29 depend from claim 27 and are directed to statutory 
subject matter at least by virtue of their dependencies. 

Abel, Alone or in Combination with Jaber, Does Not Render the Claims Obvious 

The Examiner rejected claims 1-7, 1 1-13, 15-18, 27-29 and 31-33 under 35 USC 
Section 103(a) as obvious over U.S. Patent No. 5,991,787 issued to Abel et al., in view of U.S. 
Patent No. 6,792,441 issued to Jaber. The Examiner's rejections are respectfully traversed. 

The Examiner appears to rely on Jaber as allegedly teaching the distributing of the 
butterfly operations. The operation of Jaber for a 16-point FFT distributed among 4 processors is 
described below with reference to the some of the figures of Jaber, and is contrasted with the 
operation of the present disclosure. This discussion is illustrative purposes only, and can be 
generalized to larger FFTs (size N), as well as to different numbers of processors. Inventor 
Decl.,1jl4. 
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Fig 10 of US6792441 (Jaber et al) 

With reference to Figure 10 of Jaber, reproduced above, Jaber teaches that the set 
of input data points (N=16) is divided among 4 processors (P=4) such that each processor works 
on 4 data points (i.e. executes 2 butterfly computations) in each stage of an FFT. All 4 
processors execute this task in parallel without the need to send or receive data to or from the 
other parallel processor, for the first two stages of FFT. This is followed by a combination phase 
which needs the outputs (4x4=16) of all these processors and produces 16 outputs finally. The 
combination phase is comprised of the last 2 (log 2 P) stages of the FFT. Inventor Decl., | 15. 
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As a result butterfly computations involving input data points are assigned to the 
processors as follows: 



{f0, f4 5 f8, fl2} to first processor 
{fl, f5, f9, fl3} to second processor 
{£2, f6, flO, fl4} to third processor 
{f3, f7, fl 1, fl5} to fourth processor y 



Gl 



Inventor Decl., ^ 16. 



Until stage 2 of the FFT computation, all 4 processors execute their butterfly 
operations in parallel and independently of each other. Thereafter, stages 3 and 4 of the FFT 
computation are assigned to a combination phase. For the case of a 16-point FFT, the 
computations require 8 (N/2) twiddle factors/coefficients named, {WO, W7}. The first two 
stages require the coefficients WO and Wl only. The combination phase requires the entire set 
of coefficients {W0,...,W7}. This necessitates that the coefficients WO, Wl be accessible to all 
the 4 processors at the same time instant. This is evident from Fig 8 of Jaber, reproduced below, 
which shows a coefficient memory (804) being accessed by all the processors 807A,...807B. 
Inventor Decl., f 17. 
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Figure 8 of Jaber 

If we consider as a second example the case of a 32 point FFT (N=32) on 4 
parallel processors (P=4), input data points are assigned to the processors as follows: 



{f0, f4, f8, fl2, fl6, £20, £24, f28} to first processor 
{fl, f5, f9, fl3, fl7, f21, f25, f29} to second processor 
{f2, f6, AO, fl4, fl8, £22, £26, 00} to third processor 
{Q, f7, fl 1, fl5, fl9, £23, £27, f31 } to fourth processor 



> 



G2 



Inventor Decl., % 18. 

There are total 5 stages in this case. Until stage 3 of the FFT computation, all 4 
processors execute their butterfly operations in parallel and independently of each other. 
Thereafter, stages 4 and 5 of the FFT computation are assigned to a combination phase. In this 
case too, the combination phase is comprised of the last log2P = 2 stages of the 32 point FFT. 
Inventor DecL, If 19. 

In this case there are 16 twiddle coefficients {W0,...,W15}, out of which 
{W0,W1,W2,W3} are required to be accessible to all the 4 processors. The combination phase 
requires the entire set of coefficients { WO,.. .,W15}. Inventor Decl., f 20. 
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In the general case of an N point FFT being executed on P parallel processors, 
there would be a total of log 2 N stages, out of which the last log2p stages comprise the 
combination phase and the remaining (log 2 N- log 2 P) stages are computed in P processors in 
parallel. In the general case, there would be N/2 twiddle coefficients in all { WO,. . .,W N/2 -l }, out 



of which the identical N/(2P) number of twiddle coefficients need to be accessible to all the P 
parallel processors all the while. The combination phase requires access to all the N/2 twiddle 
coefficients. Inventor Decl., ^ 2 1 . 




^ CUTRfl tBOft | 

Figure 8 of Jaber Modified to Show Distribution of Twiddle Coefficients in Jaber 

Figure 8 of Jaber appears again above, modified to show the distribution of the 
twiddle coefficients from the coefficient memory as per the Jaber. Out of N/2 coefficients, N/2P 
coefficients are needed by all of the processors all of the time, and the entire set is needed by the 
combinational phase. Inventor Decl., ^ 22. 
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Butterfly Distribution of Present Disclosure for 4-Processor Configuration: N = 16; P = 4 



The butterfly distribution of the present disclosure is illustrated above for N = 16 
and P = 4. The set of input data points (N=16) is divided among 4 processors (P=4) such that 
each processor works on 4 data points (i.e. executes 2 butterfly computations) in each stage of 
the FFT. (A 16 point FFT is shown to facilitate a comparison with Jaber; Figure 3 of the present 
application shows an 8 point FFT on a 4 processor system). All 4 processors execute this task in 
parallel without the need to send or receive data to or from the other parallel processors, for the 
last two stages of the FFT. During the first log 2 P =2, there is data dependency between the 
parallel processors, i.e. there is need of data to/from each other in the first two stages. Therefore, 
parallel processing of butterfly computations starts after the first 2 stages. As shown in the above 
figure (stage 3, stage 4), the butterflies colored in red are computed in the first processor, the 
butterflies colored in blue are computed in the second processor, the butterflies colored in green 
are computed in the third processor, the butterflies colored in black are computed in the fourth 
processor. It can be seen from stage 4 that all the red colored butterflies take input data points 
only from red colored butterflies of the previous stage (i.e. stage 3), and so forth for the other 
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processors. Inventor Decl., 23. As a result butterfly computations involving input data points 
are assigned to the processors as follows: 

{ x(0),x(8),x( 1 ),x(9) } to first processor 
{x(4), x(12), x(5), x(13)} to second processor 
{x(2), x(10), x(3),x (1 1)} to third processor 
{x(6) ,x(14), x(7), x(15)} to fourth processor 

Inventor Decl., % 23. 

After stage 2 of the FFT computation, all 4 processor execute their butterfly 
operations in parallel and independently of each other until the final output. Prior to that, stages 1 
and 2 of the FFT computation have data dependency among the parallel processors. For the case 
of a 16-point FFT, the computations require 8 (=N/2) twiddle factors/coefficients {W°,...,W 7 }. 
Inventor Decl., K 24. As shown in the butterfly distribution illustrated on the previous page, the 
requirement of twiddle coefficients by the 4 processors is as follows: 

{w°, w 4 } to first processor 
{w 1 , w 5 } to second processor 
{w 2 , w 6 } to third processor 
{w 3 , w 7 } to fourth processor 

Inventor Decl., ^ 24. 

It may be noted that owing to the innovative scheme of distribution of butterfly 
computations among processors, the sets of twiddle coefficients required by different processors 
are disjoint and the number of coefficients in each set does not exceed 2. Inventor Decl., If 25. 
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If we consider as a second example the case of a 32 point FFT (N=32) on 4 
parallel processors (P=4), input data points are assigned to the processors as follows: 

{x(0),x(16),x(l),x(17), x(2),x(18),x(3),x(19)} to first processor — 
{x(8),x(24),x(9),x(25), x(10),x(26),x(l l),x(27)} to second processor 
{x(4),x(20),x(5),x(21), x(6),x(22),x(7),x(23)} to third processor G5 
{x(12),x(28),x(13),x(29), x(14),x(30),x(15),x(31)} to fourth processor^ 



Inventor Decl., Tf 26. 

After stage 2 of the FFT computation, all 4 processor execute their butterfly 
operations in parallel and independently of each other until the final output. Prior to that, stages 
1 and 2 of the FFT computation have data dependency among the parallel processors. For the 
case of a 32-point FFT, the computations require 16 (=N/2) twiddle factors/coefficients named 
{W°,...,W 15 }. The requirement of twiddle coefficients by the 4 processors is as follows: 



{w°, w 4 , w 8 , w 12 } to first processor 
{w 1 , w 5 , w 9 , w 13 } to second processor 
{w 2 , w 6 , w 10 , w 14 } to third processor 
{w 3 , w 7 ,w ! \ w 15 } to fourth processor 



G6 



Inventor Decl., \ 27. 

It may be noted again that owing to the innovative scheme of distribution 
butterfly computations among processors, the sets of twiddle coefficients required by different 
processors are disjoint and the number of coefficients in each set does not exceed 4. Inventor 
Decl., If 28. 

In the general case of an N point FFT being executed on P parallel processors, 
there would be a total of log2N stages, out of which the first log2P stages have data dependency 
between the parallel processors and the remaining (log2N- log2P) stages are computed in P 
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processors in parallel without need of data to/from each other. In the general case, there would 
be N/2 twiddle coefficients in all {W°,...,W N/:M }, out of which disjoint sets N/(2P) of twiddle 
coefficients are needed by each of the P parallel processors except when P=2. In this case also 
(P=2), one processor only needs all the twiddle coefficients, whereas the other one needs only a 
subset having N/(2P) twiddle coefficients. Inventor Decl., 29. 

The fundamental difference lies in the method of butterfly distribution among the 
parallel processor as shown in the assignments of butterflies among the parallel processors, Gl, 
G3 for 16-point FFT/IFFT and G2, G5 for 32-point FFT/IFFT. As a result of this, the stages 
where there are dependencies among the parallel processors are in the first log 2 P stages as per the 
present disclosure, whereas, the dependencies among the parallel processors are in the last log 2 P 
stages as per Jaber. Inventor Decl., ^ 30. 

Another manifestation of the present disclosure can be seen in the uses of twiddle 
coefficients in the different parallel processors. As per the present proposal, it may be noted that 
although all N/2 coefficients are used for the entire FFT computation, no parallel processor uses 
more than an N/2P subset of these, having no common coefficients with any other processors 
except when P=2. Whereas as per Jaber, an identical N/(2P) twiddle coefficients need to be 
accessible to all the P parallel processors all the while. The combination phase of Jaber requires 
access to all the N/2 twiddle coefficients. Figure 8 of Jaber is modified again on the following 
page to show how the twiddle coefficients would be applied if Jaber were modified in 
accordance with the disclosure of the present application. As can be seen, distinct N/2P subsets 
of the N/2 coefficients in the coefficient memory are used by different parallel processors. At any 
stage no processor has need of any coefficients which are required by any other processor. 
Please note that the present disclose is not limited in any way to application to the particular 
architecture of Jaber. It can be applicable to any system architecture in general, including shared 
and distributed memory systems. The use of shared coefficient memory taught in Jaber is 
rendered unnecessary in the present proposal due to the innovative method of distribution of 
butterflies among the parallel processors. Inventor Decl., 1 3 1 . 
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Figure 8 of Jaber Modified to Show Distribution of Twiddle Coefficients 
According to Present Disclosure 

Turning to the language of the claims, claim 1 as amended recites, "each stage in 
the second set of stages has a single, un-nested computation loop; and distributing the plurality of 
butterfly operations in each stage of the second set of stages such that each processor computes 
an equal number of complete butterfly operations thereby eliminating data interdependency 
among the parallel processors." Abel, alone or in combination with Jaber, does not teach or 
suggest, or otherwise render obvious, "distributing the plurality of butterfly operations in each 
stage of the second set of stages such that each processor computes an equal number of complete 
butterfly operations thereby eliminating data interdependency among the parallel processors," as 
recited. Inventor Decl., ^ 32. Claims 2, 7 and 1 1 depend from claim 1 and are allowable at 
least by virtue of their dependencies. 
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In response, the Examiner first mischaracterizes Applicants' argument. 
Applicants compared and contrasted the operation of Jaber and the present disclosure as 
background for its argument that Jaber did not teach a recited limitation as contended by the 
Examiner. For example, with respect to claim 1, that Jaber did not teach "distributing the 
plurality of butterfly operations in each stage of the second set of stages such that each processor 
computes an equal number of complete butterfly operations thereby eliminating data 
interdependency among the parallel processors," as contended by the Examiner. The Examiner 
also contends in response that because Jaber stores the coefficients in memory and retrieves the 
coefficients from memory there is no data dependency. This argument is incorrect because in 
Jaber, the coefficients are shared between the processors, as discussed above. Specifically, out 
of N/2 coefficients, N/2P coefficients are needed by all of the processors of Jaber all of the time, 
and the entire set is needed by the combinational phase. Thus, the stages of Jaber have data 
interdependencies among the processors. Inventor Decl., f 33. 

Independent claim 3 recites, "means for computing a second plurality of stages of 
the N-point FFT/IFFT of the signal using in each stage of the second plurality of stages a 
plurality of butterfly operations, wherein each butterfly operation employs a single butterfly 
computation loop without employing nested loops; and means for distributing the butterfly 
operations in each stage of the second plurality of stages such that each processor computes an 
equal number of complete butterfly operations thereby eliminating data interdependency in the 
stages of the second plurality of stages." Abel, alone or in combination with Jaber, does not 
teach or suggest, or otherwise render obvious, "means for distributing the butterfly operations in 
each stage of the second plurality of stages such that each processor computes an equal number 
of complete butterfly operations thereby eliminating data interdependency in the stages of the 
second plurality of stages " as recited. Inventor Decl., K 34. Claims 4, 12, 13 and 15 depend 
from claim 3 and are allowable at least by virtue of their dependencies. With regard to the 
Examiner's responses, it is noted that it was argued that a specific recited feature is missing from 
the cited combination of references, and, as discussed above with respect to claim 1 , the stages of 
Jaber have data interdependencies among the processors. 



30 



Application No. 10/727,138 

Reply to Office Action dated February 1, 2010 

Independent claim 5, as amended, recites, "computing a first plurality of stages of 
an N-point FFT/IFFT; and computing a second plurality of stages of the N-point FFT without 
employing nested loops and by distributing the butterfly operations in each stage of the second 
plurality of stages such that each processor computes an equal number of complete butterfly 
operations thereby eliminating data interdependency in the stage." Abel, alone or in combination 
with Jaber, does not teach or suggest, or otherwise render obvious, "computing a second plurality 
of stages of the N-point FFT without employing nested loops and by distributing the butterfly 
operations in each stage of the second plurality of stages such that each processor computes an 
equal number of complete butterfly operations thereby eliminating data interdependency in the 
stage," as recited. Inventor Decl., If 35. Claim 6 depends from claim 5 and is allowable at least 
by virtue of its dependencies. With regard to the Examiner's responses, it is noted that it was 
argued that a specific recited feature is missing from the cited combination of references, and, as 
discussed above with respect to claim 1, the stages of Jaber have data interdependencies among 
the processors. 

Independent claim 16, as amended, recites, "computing an N-point FFT/IFFT 
using a first plurality of butterfly computational stages and a second plurality of butterfly 
computational stages, each stage in the second plurality of stages employing a plurality of 
butterfly operations having a single, un-nested computation loop; and distributing the plurality of 
butterfly operations in each stage of the second plurality of stages such that each processor 
computes an equal number of complete butterfly operations thereby eliminating data 
interdependency in the stage." Abel, alone or in combination with Jaber, does not teach, suggest 
or otherwise render obvious, "distributing the plurality of butterfly operations in each stage of 
the second plurality of stages such that each processor computes an equal number of complete 
butterfly operations thereby eliminating data interdependency in the stage," as recited. Inventor 
Decl., TI 36. Claims 17 and 18 are allowable at least by virtue of their dependencies. With regard 
to the Examiner's responses, it is noted that it was argued that a specific recited feature is 
missing from the cited combination of references, and, as discussed above with respect to claim 
1, the stages of Jaber have data interdependencies among the processors. 
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Independent claim 27, as amended, recites, "compute a first number of butterfly 
stages of an N-point Fast Fourier Transform (FFT) or Inverse Fast Fourier transform (IFFT); and 
compute remaining butterfly stages of the N-point FFT/IFFT with a single iterative loop wherein 
each processor computes an equal number of butterfly operations and there is no data 
dependency between butterflies in a stage of an iteration of the loop." Abel, alone or in 
combination with Jaber, does not teach, suggest or otherwise render obvious, "compute 
remaining butterfly stages of the N-point FFT/IFFT with a single iterative loop wherein each 
processor computes an equal number of butterfly operations and there is no data dependency 
between butterflies in a stage of an iteration of the loop," as recited. Inventor Decl., ^ 37. 
Claims 28 and 29 are allowable at least be virtue of their dependencies. With regard to the 
Examiner's responses, it is noted that it was argued that a specific recited feature is missing from 
the cited combination of references, and, as discussed above with respect to claim 1, the stages of 
Jaber have data interdependencies among the processors. 

Independent claim 3 1 , as amended, recites, "compute a first number of butterfly 
stages of an N-point Fast Fourier Transform (FFT) or Inverse Fast Fourier Transform (IFFT) of a 
digital signal; and compute remaining butterfly stages of the N-point FFT/IFFT with a single 
iterative loop wherein there is no data dependency between butterflies in a stage of an iteration of 
the loop." Abel, alone or in combination with Jaber, does not teach, suggest or otherwise render 
obvious, "compute remaining butterfly stages of the N-point FFT/IFFT with a single iterative 
loop wherein there is no data dependency between butterflies in a stage of an iteration of the 
loop," as recited. Inventor Decl., U 38. Claims 32 and 33 are allowable at least by virtue of 
their dependencies. With regard to the Examiner's responses, it is noted that it was argued that a 
specific recited feature is missing from the cited combination of references, and, as discussed 
above with respect to claim 1 , the stages of Jaber have data interdependencies among the 
processors. 
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No New Matter Has Been Introduced 

The Examiner also objected to the amendment filed with the RCE of November 
1 1, 2009 as introducing new matter. Specifically, the Examiner contends each of the 
independent claims introduces new matter based on the limitations "first and second sets of 
butterfly computational stages . . . wherein the second stage employs single un-nested 
computational loop." The Examiner's objections are respectfully traversed. 

As an initial matter, none of the current independent claims contain the precise 
language objected to by the Examiner. Thus, it is difficult, if not impossible, for Applicants to 
ascertain what exactly the Examiner considers to be new matter in the independent claims, and 
the Examiner does not explain the basis for the objection. Nevertheless, Applicants will attempt 
to address each independent claim in turn. 

Independent claim 1 as originally filed appears below. 

1 . A linear scalable method for computing a Fast Fourier Transform (FFT) or 
Inverse Fast Fourier transform (IFFT) in a multiprocessing system using a decimation in time 
approach, comprising the steps of: 

computing first and second stages of log2N stages of an N-point FFT/IFFT as a 
single radix-4 butterfly operation while implementing the remaining (log2N-2) stages using 
radix-2 butterfly operations, wherein each radix-2 butterfly operation employs a single radix-2 
butterfly computation loop without employing nested loops; and 

distributing the butterfly operations in each stage such that each processor 
computes an equal number of complete butterfly operations thereby eliminating data 
interdependency in the stage. 

There is a strong presumption that the specification contains an adequate 
disclosure as filed and the Examiner has the burden of presenting evidence or reasons why one of 
skill in the art would not recognize that the written description provides support for the claims. 
MPEP 2163(11). Further, there is no in haec verba requirement. MPEP2163. Id Ifitisthe 
word "sets" to which the Examiner objects, claim 1 as originally filed clearly was addressed to 
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processing two sets of stages (the first set comprising the first and second stages of butterfly 
computational stages and the second set comprising the remaining stages), with the second set 
(the remaining stages) employing "a single . . . computational loop without employing nested 
loops." A "set" has a well-known meaning and one of skill in the art would have recognized that 
original claim 1 referred to sets of computational stages. Inventor Deck, 1} 42. Moreover, 
Figures 2 and 3 illustrates sets of computational stages and the description thereof on pages 6-7 
is clearly directed to handling the first two stages in one manner and the remaining stages in 
another. Id To the extent the Examiner is referring to "a single un-nested computational loop," 
this is almost identical to the original language "a single . . . computational loop without 
employing nested loops." Further, as discussed above, one of skill in the art would have 
recognized Figures 2 and 3 and the description thereof on pages 6 and 7 as disclosing 
"computing an N-point FFT/IFFT of the signal using first and second sets of butterfly 
computational stages, each set in the second set of stages employing a plurality of butterfly 
operations, wherein each of the butterfly operations in each stage in the second set of stages has a 
single, un-nested computational loop." Inventor Deck, f 43. Thus, it is not seen how claim 1 as 
amended introduces any new matter. To the extent the Examiner is really asserting an 
enablement rejection, this rejection is addressed above. 

Independent claim 3, as originally filed, appears below. 

3. A linear scalable system for computing a Fast Fourier Transform (FFT) or 
Inverse Fast Fourier transform (IFFT) in a multiprocessing system using a decimation in time 
approach, comprising: 

means for computing first and second stages of log 2 N stages of an N-point 
FFT/IFFT as a single radix-4 butterfly operation while implementing the remaining (log2N-2) 
stages using radix-2 butterfly operations, wherein each radix-2 butterfly operation employs a 
single radix-2 butterfly computation loop without employing nested loops; and 

means for distributing the butterfly operations in each stage such that each 
processor computes an equal number of complete butterfly operations thereby eliminating data 
interdependency in the stage. 
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As noted above, there is no in hac verba requirement. Claim 3 as previously 
amended does not use the word "sets." If it is "plurality of stages" to which the Examiner 
objects, it is respectfully submitted that "plurality" has a definite meaning in the claims of a 
patent, specifically, more than one, and one of skill in the art would have recognized both "first 
and second stages" as a first plurality of stages, and "remaining stages" as a second plurality of 
stages. Inventor Decl., f 46. If it is "employs a single . . . butterfly computational loop without 
employing nested loops," to which the Examiner objects, this language appears in claim 3 as 
originally filed. Thus, it is not seen how claim 3 as previously amended introduces any new 
matter. To the extent the Examiner is really asserting an enablement rejection, this rejection is 
addressed above. 

Independent claim 5, as originally filed, appears below. 

5. A computer program product comprising computer readable program code 
stored on a computer readable storage medium embodied therein for computing a Fast Fourier 
Transform (FFT) or Inverse Fast Fourier transform (IFFT) in a multiprocessing system using a 
decimation in time approach, comprising: 

computer readable program code means configured for computing computing first 
and second stages of log2N stages of an N-point FFT/IFFT as a single radix-4 butterfly operation 
while implementing the remaining (log2N-2) stages using radix-2 butterfly operations, wherein 
each radix-2 butterfly operation employs a single radix-2 butterfly computation loop without 
employing nested loops; and 

computer readable program code means configured for distributing the butterfly 
operations in each stage such that each processor computes an equal number of complete 
butterfly operations thereby eliminating data interdependency in the stage. 

As noted above, there is no in hac verba requirement. Claim 5 as amended does 
not use the word "sets." If it is "plurality of stages" to which the Examiner objects, it is 
respectfully submitted that "plurality" has a definite meaning in the claims of a patent, 
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specifically, more than one, and one of skill in the art would have recognized both "first and 
second stages" as a first plurality of stages, and "remaining stages" as a second plurality of 
stages. Inventor Decl., ^ 50. If it is "without employing nested loops," to which the Examiner 
objects, this language appears in claim 5 as originally filed. Thus, it is not seen how claim 5 as 
amended introduces any new matter. To the extent the Examiner is really asserting an 
enablement rejection, this rejection is addressed above. 

With regard to independent claim 16, to the extent the Examiner objects to the 
word "plurality," it is respectfully submitted that "plurality" has a definite meaning in the claims 
of a patent, specifically, more than one, and one of skill in the art would have recognized both 
"first and second stages" (see original claim 5) as a first plurality of stages, and "remaining 
stages" (see original claim 5) as a second plurality of stages. Inventor Decl., ^ 52. To the extent 
the Examiner objects to "having a single, un-nested computational loop," Applicants refer the 
Examiner to original claims 1, 3 and 5, and the discussion of claims 1, 3 and 5 above. Thus, it is 
not seen how claim 16 as amended introduces any new matter. To the extent the Examiner is 
really asserting an enablement rejection, this rejection is addressed above. 

With regard to independent claims 27 and 3 1, to the extent the Examiner objects 
to "a first number of butterfly stages" it is respectfully submitted that one of skill in the art would 
have recognized "first and second stages" (see original claim 5) as a first number of stages, and 
"remaining stages" (see original claim 5) as "remaining stages." Inventor Decl., | 53. To the 
extent the Examiner objects to "a single iterative loop," Applicants refer the Examiner to the 
specification, page 7, lines 8-13. Thus, it is not seen how claims 27 and 31 as amended introduce 
any new matter. To the extent the Examiner is really asserting enablement rejections, these 
rejections are addressed above. 

The Director is authorized to charge any additional fees due by way of this 
Amendment, or credit any overpayment, to our Deposit Account No. 19-1090. 



36 



Application No. 10/727,138 

Reply to Office Action dated February 1, 2010 



All of the claims remaining in the application are now clearly allowable. 
Favorable consideration and a Notice of Allowance are earnestly solicited. 
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